Document Logical Structure Analysis Based on Perceptive Cycles
Internal identifier: 001083 (Main/Exploration); previous: 001082; next: 001084
Authors: Yves Rangoni [France]; Abdel Belaïd [France]
Source:
- Lecture Notes in Computer Science [ 0302-9743 ] ; 2006.
Abstract
Abstract: This paper describes a Neural Network (NN) approach for logical document structure extraction. In this NN architecture, called Transparent Neural Network (TNN), the document structure is stretched along the layers, allowing an interpretation decomposition from physical (NN input) to logical (NN output) level. The intermediate layers represent successive interpretation steps. Each neuron is apparent and associated to a logical element. The recognition proceeds by repetitive perceptive cycles propagating the information through the layers. In case of low recognition rate, an enhancement is achieved by error backpropagation leading to correct or pick up a more adapted input feature subset. Several feature subsets are created using a modified filter method. The first experiments performed on scientific documents are encouraging.
DOI: 10.1007/11669487_11
Affiliations:
- France
- Alsace-Champagne-Ardenne-Lorraine, Région Lorraine
- Nancy, Vandœuvre-lès-Nancy
- Centre national de la recherche scientifique, Institut national de recherche en informatique et en automatique, Laboratoire lorrain de recherche en informatique et ses applications, Université de Lorraine
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000E02
- to stream Istex, to step Curation: 000D70
- to stream Istex, to step Checkpoint: 000A56
- to stream Main, to step Merge: 001100
- to stream Main, to step Curation: 001083
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Document Logical Structure Analysis Based on Perceptive Cycles</title>
<author><name sortKey="Rangoni, Yves" sort="Rangoni, Yves" uniqKey="Rangoni Y" first="Yves" last="Rangoni">Yves Rangoni</name>
</author>
<author><name sortKey="Belaid, Abdel" sort="Belaid, Abdel" uniqKey="Belaid A" first="Abdel" last="Belaïd">Abdel Belaïd</name>
<affiliation><country>France</country>
<placeName><settlement type="city">Nancy</settlement>
<region type="region" nuts="2">Alsace-Champagne-Ardenne-Lorraine</region>
<region type="region" nuts="2">Région Lorraine</region>
</placeName>
<orgName type="laboratoire" n="5">Laboratoire lorrain de recherche en informatique et ses applications</orgName>
<orgName type="university">Université de Lorraine</orgName>
<orgName type="institution">Centre national de la recherche scientifique</orgName>
<orgName type="institution">Institut national de recherche en informatique et en automatique</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:941221242190433760FC0F9CDB3B864820FE5AF5</idno>
<date when="2006" year="2006">2006</date>
<idno type="doi">10.1007/11669487_11</idno>
<idno type="url">https://api.istex.fr/document/941221242190433760FC0F9CDB3B864820FE5AF5/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000E02</idno>
<idno type="wicri:Area/Istex/Curation">000D70</idno>
<idno type="wicri:Area/Istex/Checkpoint">000A56</idno>
<idno type="wicri:doubleKey">0302-9743:2006:Rangoni Y:document:logical:structure</idno>
<idno type="wicri:Area/Main/Merge">001100</idno>
<idno type="wicri:Area/Main/Curation">001083</idno>
<idno type="wicri:Area/Main/Exploration">001083</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Document Logical Structure Analysis Based on Perceptive Cycles</title>
<author><name sortKey="Rangoni, Yves" sort="Rangoni, Yves" uniqKey="Rangoni Y" first="Yves" last="Rangoni">Yves Rangoni</name>
<affiliation wicri:level="1"><country xml:lang="fr">France</country>
<wicri:regionArea>Loria Research Center – Read Group, Vandœuvre-lès-Nancy</wicri:regionArea>
<placeName><settlement type="city">Vandœuvre-lès-Nancy</settlement>
<settlement type="city" wicri:auto="agglo">Nancy</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">France</country>
</affiliation>
</author>
<author><name sortKey="Belaid, Abdel" sort="Belaid, Abdel" uniqKey="Belaid A" first="Abdel" last="Belaïd">Abdel Belaïd</name>
<affiliation wicri:level="1"><country xml:lang="fr">France</country>
<wicri:regionArea>Loria Research Center – Read Group, Vandœuvre-lès-Nancy</wicri:regionArea>
<placeName><settlement type="city">Vandœuvre-lès-Nancy</settlement>
<settlement type="city" wicri:auto="agglo">Nancy</settlement>
</placeName>
<placeName><settlement type="city">Nancy</settlement>
<region type="region" nuts="2">Alsace-Champagne-Ardenne-Lorraine</region>
<region type="region" nuts="2">Région Lorraine</region>
</placeName>
<orgName type="laboratoire" n="5">Laboratoire lorrain de recherche en informatique et ses applications</orgName>
<orgName type="university">Université de Lorraine</orgName>
<orgName type="institution">Centre national de la recherche scientifique</orgName>
<orgName type="institution">Institut national de recherche en informatique et en automatique</orgName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">France</country>
<placeName><settlement type="city">Nancy</settlement>
<region type="region" nuts="2">Alsace-Champagne-Ardenne-Lorraine</region>
<region type="region" nuts="2">Région Lorraine</region>
</placeName>
<orgName type="laboratoire" n="5">Laboratoire lorrain de recherche en informatique et ses applications</orgName>
<orgName type="university">Université de Lorraine</orgName>
<orgName type="institution">Centre national de la recherche scientifique</orgName>
<orgName type="institution">Institut national de recherche en informatique et en automatique</orgName>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2006</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">941221242190433760FC0F9CDB3B864820FE5AF5</idno>
<idno type="DOI">10.1007/11669487_11</idno>
<idno type="ChapterID">11</idno>
<idno type="ChapterID">Chap11</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: This paper describes a Neural Network (NN) approach for logical document structure extraction. In this NN architecture, called Transparent Neural Network (TNN), the document structure is stretched along the layers, allowing an interpretation decomposition from physical (NN input) to logical (NN output) level. The intermediate layers represent successive interpretation steps. Each neuron is apparent and associated to a logical element. The recognition proceeds by repetitive perceptive cycles propagating the information through the layers. In case of low recognition rate, an enhancement is achieved by error backpropagation leading to correct or pick up a more adapted input feature subset. Several feature subsets are created using a modified filter method. The first experiments performed on scientific documents are encouraging.</div>
</front>
</TEI>
<affiliations><list><country><li>France</li>
</country>
<region><li>Alsace-Champagne-Ardenne-Lorraine</li>
<li>Région Lorraine</li>
</region>
<settlement><li>Nancy</li>
<li>Vandœuvre-lès-Nancy</li>
</settlement>
<orgName><li>Centre national de la recherche scientifique</li>
<li>Institut national de recherche en informatique et en automatique</li>
<li>Laboratoire lorrain de recherche en informatique et ses applications</li>
<li>Université de Lorraine</li>
</orgName>
</list>
<tree><country name="France"><noRegion><name sortKey="Rangoni, Yves" sort="Rangoni, Yves" uniqKey="Rangoni Y" first="Yves" last="Rangoni">Yves Rangoni</name>
</noRegion>
<name sortKey="Belaid, Abdel" sort="Belaid, Abdel" uniqKey="Belaid A" first="Abdel" last="Belaïd">Abdel Belaïd</name>
<name sortKey="Belaid, Abdel" sort="Belaid, Abdel" uniqKey="Belaid A" first="Abdel" last="Belaïd">Abdel Belaïd</name>
<name sortKey="Rangoni, Yves" sort="Rangoni, Yves" uniqKey="Rangoni Y" first="Yves" last="Rangoni">Yves Rangoni</name>
</country>
</tree>
</affiliations>
</record>
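The identifiers in the record above are stored as `idno` elements keyed by their `type` attribute. A minimal sketch, assuming Python's standard `xml.etree.ElementTree` and a hand-copied fragment of the record, shows one way to read them into a dictionary. The function name `idno_map` and the constant `FRAGMENT` are hypothetical names introduced for illustration; they are not part of Dilib or Wicri. The full record also carries `wicri:`-prefixed attributes with no namespace declaration shown here, so a strict XML parser may reject it as-is; the sketch therefore works on a fragment without such attributes.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the publicationStmt element of the record above.
# "wicri:" appears here only inside attribute values, which is plain text
# to the parser.
FRAGMENT = """<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="doi">10.1007/11669487_11</idno>
<idno type="wicri:Area/Istex/Corpus">000E02</idno>
<idno type="wicri:Area/Main/Exploration">001083</idno>
</publicationStmt>"""


def idno_map(xml_text: str) -> dict:
    """Map each idno 'type' attribute to its text content.

    Hypothetical helper: if a type occurs more than once, the last
    occurrence wins.
    """
    root = ET.fromstring(xml_text)
    return {el.get("type"): (el.text or "").strip() for el in root.iter("idno")}


if __name__ == "__main__":
    for id_type, value in idno_map(FRAGMENT).items():
        print(f"{id_type}: {value}")
```

A dictionary is used here for brevity; a list of pairs would be the safer choice for parts of the record where the same `type` is repeated.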
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001083 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001083 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:941221242190433760FC0F9CDB3B864820FE5AF5 |texte= Document Logical Structure Analysis Based on Perceptive Cycles }}
This area was generated with Dilib version V0.6.32.